45 research outputs found
Using auditory classification images for the identification of fine acoustic cues used in speech perception
An essential step in understanding the processes underlying perceptual categorization is to identify which portions of a physical stimulus modulate the behavior of our perceptual system. More specifically, in the context of speech comprehension, it remains a major open challenge to determine which information is used to categorize a speech stimulus as one phoneme or another: the auditory primitives relevant to the categorical perception of speech are still unknown. Here we propose to adapt a method relying on a Generalized Linear Model (GLM) with smoothness priors, already used in the visual domain to estimate so-called classification images, to auditory experiments. This statistical model offers a rigorous framework for dealing with non-Gaussian noise, as is often the case in the auditory modality, and limits the amount of noise in the estimated template by enforcing smoother solutions. By applying this technique to a two-alternative forced-choice experiment between the stimuli "aba" and "ada" in noise with an adaptive SNR, we confirm that the second formant transition is key for classifying phonemes as /b/ or /d/ in noise, and that its estimation by the auditory system is a relative measurement across spectral bands, made in relation to the perceived height of the second formant in the preceding syllable. Through this example, we show how the GLM-with-smoothness-priors approach can be applied to the identification of fine functional acoustic cues in speech perception. Finally, we discuss some assumptions of the model in the specific case of speech perception.
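The estimation idea can be sketched in a few lines: fit a logistic GLM to trial-by-trial noise fields and binary responses, with a penalty on second differences of the weights standing in for the smoothness prior. This is a toy illustration on synthetic data, not the authors' actual model, stimuli, or parameter choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 2AFC-style observer: decisions depend on a smooth internal
# template applied to the noise field (a toy stand-in for the "aba"/"ada"
# task; all dimensions and values here are illustrative).
D, N = 32, 4000
t = np.linspace(0, 1, D)
w_true = np.exp(-0.5 * ((t - 0.5) / 0.1) ** 2)      # smooth "acoustic cue" template
X = rng.standard_normal((N, D))                      # trial-by-trial noise fields
p = 1.0 / (1.0 + np.exp(-(X @ w_true)))              # Bernoulli response probabilities
y = (rng.random(N) < p).astype(float)                # observer's binary responses

def fit_glm_smooth(X, y, lam=1.0, lr=0.05, n_iter=2000):
    """Logistic GLM with a smoothness prior: penalize squared second
    differences of the weight vector (one simple way to encode the
    'smoother solutions' constraint)."""
    n, d = X.shape
    L = np.diff(np.eye(d), n=2, axis=0)              # second-difference operator
    P = L.T @ L
    w = np.zeros(d)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (y - mu) / n - lam * (P @ w)    # log-likelihood + prior gradient
        w += lr * grad
    return w

w_hat = fit_glm_smooth(X, y)
corr = np.corrcoef(w_hat, w_true)[0, 1]
print(f"correlation with true template: {corr:.2f}")
```

With enough trials the penalized estimate recovers the shape of the internal template; the penalty weight trades variance in the estimate against bias toward smoothness.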
Dual Coding of Frequency Modulation in the Ventral Cochlear Nucleus.
Frequency modulation (FM) is a common acoustic feature of natural sounds and is known to play a role in robust sound source recognition. Auditory neurons show precise stimulus-synchronized discharge patterns that may be used for the representation of low-rate FM. However, it remains unclear whether this representation is based on synchronization to slow temporal envelope (ENV) cues resulting from cochlear filtering or on phase locking to faster temporal fine structure (TFS) cues. To investigate the plausibility of these encoding schemes, single units of the ventral cochlear nucleus of guinea pigs of either sex were recorded in response to sine FM tones centered at the unit's best frequency (BF). The results show that, in contrast to high-BF units, low-BF units (<4 kHz) demonstrate good phase locking to TFS for modulation depths within the receptive field. For modulation depths extending beyond the receptive field, the discharge patterns follow the ENV and fluctuate at the modulation rate. The receptive field proved to be a good predictor of the ENV responses for most primary-like and chopper units. The current in vivo data also reveal a high level of diversity in responses across unit types: TFS cues are mainly conveyed by low-frequency and primary-like units, and ENV cues by chopper and onset units. The diversity of responses exhibited by cochlear nucleus neurons provides a neural basis for a dual-coding scheme of FM in the brainstem based on both ENV and TFS cues.
SIGNIFICANCE STATEMENT: Natural sounds, including speech, convey informative temporal modulations in frequency. Understanding how the auditory system represents those frequency modulations (FM) has important implications, as robust sound source recognition depends crucially on the reception of low-rate FM cues. Here, we recorded 115 single-unit responses from the ventral cochlear nucleus in response to FM and provide the first physiological evidence of a dual-coding mechanism of FM via synchronization to temporal envelope cues and phase locking to temporal fine structure cues. We also demonstrate a diversity of neural responses with different coding specializations. These results support the dual-coding scheme proposed by psychophysicists to account for FM sensitivity in humans and provide new insights into how this might be implemented in the early stages of the auditory pathway.
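A sine FM tone of the kind described above is straightforward to write down: a carrier at the unit's best frequency whose instantaneous frequency is modulated sinusoidally at a low rate. The parameter values below are illustrative choices, not taken from the recordings.

```python
import math

# Sketch of a sine-FM stimulus: carrier fc, modulation rate fm,
# frequency excursion df. All values here are illustrative.
fs = 48_000          # sample rate (Hz)
fc = 1_000.0         # carrier, standing in for a unit's best frequency (Hz)
fm = 5.0             # modulation rate (Hz), i.e. low-rate FM
df = 200.0           # frequency excursion (modulation depth), Hz
beta = df / fm       # modulation index

def sine_fm(n):
    """x[n] = sin(2*pi*fc*t + beta*sin(2*pi*fm*t)); the instantaneous
    frequency is fc + df*cos(2*pi*fm*t), sweeping fc +/- df at rate fm."""
    t = n / fs
    return math.sin(2 * math.pi * fc * t + beta * math.sin(2 * math.pi * fm * t))

tone = [sine_fm(n) for n in range(fs)]   # one second of signal
```

Varying `df` relative to a unit's receptive-field width is what distinguishes the "within" and "beyond" modulation-depth conditions discussed above.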
La mie de pain n’est pas une amie: une étude EEG sur la perception de différences infra-phonémiques en situation de variations
We examined electrophysiological correlates of listeners' sensitivity to fine acoustic cues under intra-speaker variability, in order to test the relevance of such cues for the speech perception system. For this purpose, a modified oddball paradigm was used with syllables such as the French homophones la and l'a and, in a second experiment, with longer sequences such as la mie and l'amie, both /lami/. The main result of this study was the observation of a mismatch negativity (MMN) for homophone deviants. The speech perception system is thus sensitive to subphonemic differences between homophone sequences despite speech variability. Fine acoustic cues are robust enough to play a role in speech processing.
A comparative study of eight human auditory models of monaural processing
A number of auditory models have been developed using diverging approaches, either physiological or perceptual, but they share comparable stages of signal processing, as they are inspired by the same constitutive parts of the auditory system. We compare eight monaural models that are openly accessible in the Auditory Modelling Toolbox. We discuss the considerations required to make the model outputs comparable to each other, as well as the results for the following model processing stages or their equivalents: outer and middle ear, cochlear filter bank, inner hair cell, auditory nerve synapse, cochlear nucleus, and inferior colliculus. The discussion includes a list of recommendations for future applications of auditory models.
Comment: Revision 1 of the manuscript
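One stage shared by essentially all of these models is the cochlear filter bank, commonly implemented as gammatone filters at ERB-spaced center frequencies. The sketch below builds such impulse responses using the standard Glasberg–Moore ERB formula; it is a generic illustration, not the implementation of any specific model in the toolbox.

```python
import math

# Minimal gammatone filter bank sketch with ERB-derived bandwidths
# (Glasberg & Moore ERB formula). Illustrative only.
def erb(f):
    """Equivalent rectangular bandwidth (Hz) at center frequency f (Hz)."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammatone_ir(fc, fs, dur=0.025, order=4):
    """Impulse response t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t)."""
    b = 1.019 * erb(fc)                      # bandwidth factor for a 4th-order filter
    n = int(dur * fs)
    return [(k / fs) ** (order - 1)
            * math.exp(-2 * math.pi * b * k / fs)
            * math.cos(2 * math.pi * fc * k / fs)
            for k in range(n)]

fs = 48_000
centres = [125 * 2 ** (i / 2) for i in range(12)]    # ~125 Hz to ~5.7 kHz
bank = [gammatone_ir(fc, fs) for fc in centres]
```

Filtering a signal through each impulse response (by convolution) yields the multi-channel representation on which the later inner-hair-cell and neural stages operate.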
Listening strategies and inter-individual variability in stop consonant perception
Noise stimuli (ACI experiment)
<p>Noise stimuli used in the Auditory Classification Image experiment (Gaussian noise): 10,000 stimuli per participant, WAV format, 48 kHz.</p>
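A stimulus matching the stated format (Gaussian noise, 48 kHz WAV) can be generated with the standard library alone. The duration, level, and file name below are arbitrary illustrative choices, not properties of the actual dataset.

```python
import random
import struct
import wave

# Sketch: white Gaussian noise written as a 16-bit mono WAV at 48 kHz.
# Duration, RMS level, and file name are arbitrary choices.
fs = 48_000
dur_s = 0.5
rng = random.Random(0)
samples = [rng.gauss(0.0, 0.15) for _ in range(int(fs * dur_s))]

with wave.open("noise_0001.wav", "wb") as f:   # hypothetical file name
    f.setnchannels(1)                          # mono
    f.setsampwidth(2)                          # 16-bit PCM
    f.setframerate(fs)
    frames = b"".join(
        struct.pack("<h", max(-32768, min(32767, int(s * 32767))))
        for s in samples
    )
    f.writeframes(frames)
```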
Data from all participants (ACI experiment)
<p>Each .mat file contains all behavioral data from one participant, across 10,000 trials:<br>
- n_signal: the presented speech signal (1: Alda, 2: Alga, 3: Arda, 4: Arga)<br>
- correct_answer: 1 if the participant correctly identified the target (da or ga), 0 otherwise<br>
- SNR: signal-to-noise ratio at which the stimuli were presented<br>
- date: date of the corresponding trial (year, month, day, h, m, s)<br>
- stim_order: random order of presentation of the noise sounds.</p>
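The fields above combine naturally into a per-condition accuracy summary. The sketch below uses synthetic arrays in place of a real file (a real file would be read with `scipy.io.loadmat`, which returns the same fields); the random values are stand-ins, not actual data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 10_000

# Synthetic stand-in for one participant's .mat file, with the fields
# documented above (scipy.io.loadmat would yield these from a real file).
data = {
    "n_signal": rng.integers(1, 5, n_trials),        # 1:Alda 2:Alga 3:Arda 4:Arga
    "correct_answer": rng.integers(0, 2, n_trials),  # 1 = target identified
    "SNR": rng.uniform(-20, 0, n_trials),            # presentation SNR (dB)
}

labels = {1: "Alda", 2: "Alga", 3: "Arda", 4: "Arga"}
for sig, name in labels.items():
    mask = data["n_signal"] == sig
    acc = data["correct_answer"][mask].mean()
    print(f"{name}: {mask.sum()} trials, {acc:.1%} correct")
```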
Cortical oscillations and the intelligibility of degraded speech
The event-related potential method has made it possible to characterize several components associated with speech processing. However, no cortical marker of successful lexical access during speech comprehension is currently available. The aim of this study was therefore to develop an experimental protocol and a statistical analysis of electroencephalographic signals in order to identify time-frequency clusters of oscillatory activity that correlate with the intelligibility of speech stimuli. To bring out this effect, we presented participants with words degraded by noise-vocoding, before and after a short perceptual-learning phase. We compared the oscillatory activity evoked by stimuli rated as "intelligible" and "unintelligible" by the participants (N = 12). This approach revealed three activities with specific topographies and time courses linked to successful lexical access.
Speech reductions cause a de-weighting of secondary acoustic cues
The ability of the auditory system to change the perceptual weighting of acoustic cues when faced with degraded speech has long been evidenced. However, the exact changes that occur remain largely unknown. Here, we use the Auditory Classification Image (ACI) methodology to reveal the acoustic cues used in the comprehension of natural speech and of reduced (i.e., noise-vocoded or re-synthesized) speech. The results show that in the latter case the auditory system updates its listening strategy by de-weighting secondary acoustic cues, as these are often weaker and thus more easily erased in adverse listening conditions. Furthermore, our data suggest that this de-weighting does not directly depend on the actual reliability of the cues, but rather on the expected change in informativeness.
Assessment of individual listening strategies in amplitude-modulation detection and phoneme categorisation tasks